Introduce new native backup provider (KNIB) #12758
JoaoJandre wants to merge 4 commits into apache:main
Conversation
Codecov Report ❌
Additional details and impacted files:
@@ Coverage Diff @@
## main #12758 +/- ##
============================================
- Coverage 17.92% 17.84% -0.09%
- Complexity 16176 16228 +52
============================================
Files 5949 6001 +52
Lines 534058 537981 +3923
Branches 65301 65650 +349
============================================
+ Hits 95742 95994 +252
- Misses 427560 431204 +3644
- Partials 10756 10783 +27
@JoaoJandre just a heads up - my colleagues have been working on an incremental backup feature for NAS B&R (using nbd/qemu bitmap tracking & checkpoints). We're also working on a new Veeam-KVM integration for CloudStack whose PR may be out soon. My colleagues can further help review and advise on this. /cc @weizhouapache @abh1sar @shwstppr @sureshanaparti @DaanHoogland @harikrishna-patnala Just my 2 cents on the design & your comments - NAS is more than just NFS; it covers any (mountable) shared storage such as CephFS, cifs/samba, etc. Enterprise users usually don't want to mix secondary storage with backup repositories, which is why NAS B&R introduced a backup-provider-agnostic concept of backup repositories that can be explored by other backup providers.
At the time of writing that part, I believe only NFS was supported. I'll update the relevant part.
The secondary storage selector feature (introduced in 2023 by #7659) allows you to specialize secondary storages. This PR extended the feature so that you may also create selectors for backups.
Hi Joao, This looks promising. Incremental backups, quick restore and file restore features have been missing from CloudStack KVM. I am having trouble understanding some of the design choices though:
Hello, @abh1sar
I don't see why we should force the coupling of backup offerings with backup repositories, what is the benefit?
Secondary storage also has both features, although its capacity is not currently reported to users.
The secondary storage selectors feature (introduced in 2023 through #7659) allows you to specialize secondary storages. Quoting from the PR description: Furthermore, my colleagues are working on a feature to allow using alternative secondary storage solutions, such as CephFS, iSCSI and S3, while preserving compatibility with features destined to NFS storages. This feature may be extended in the future to allow essentially any type of secondary storage. Thus, the flexibility for secondary storages will soon grow.
Using any other type of backup-level compression will be worse than using qemu-img compression. This is because, when restoring a backup, we must have access to the whole backing chain. If we use other types of compression, we have to decompress the whole chain before restoring. Using qemu-img, the backing files remain valid and do not need to be decompressed; in fact, we never have to decompress at all. This is the great benefit of using qemu-img. In any case, here is a brief comparison of using qemu-img with the zstd library and 8 threads and using the
Compression using qemu-img was a lot faster, with a slightly smaller compression ratio. Furthermore, we have to consider that the qemu-img compressed image can be used as-is, while the other images must be decompressed, further adding to the processing time of backing up/restoring a backup.
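For reference, a compressed qcow2 of the kind discussed above can be produced with `qemu-img convert`. The sketch below only builds such a command line; it is an illustration of the general qemu-img syntax, not code from this PR, and the file names are placeholders:

```python
def build_compress_cmd(src: str, dst: str, coroutines: int = 8) -> list:
    # qemu-img writes a qcow2 whose clusters are zstd-compressed; the
    # resulting image remains directly usable (including as a backing
    # file), so restores never need a separate decompression pass.
    return [
        "qemu-img", "convert",
        "-O", "qcow2",
        "-c",                                # compress written clusters
        "-o", "compression_type=zstd",       # zstd instead of default zlib
        "-m", str(coroutines),               # parallel conversion coroutines
        src, dst,
    ]

cmd = build_compress_cmd("backup.qcow2", "backup-compressed.qcow2")
```

The `compression_type=zstd` option requires a QEMU version that supports zstd-compressed qcow2 images (QEMU 5.1 or later).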
The compression feature is optional; if you are using storage-level compression, you probably will not use backup-level compression. However, many environments do not have storage-level compression, so offering backup-level compression is still very useful.
The compression does not add any interaction with the SSVM.
I did not want to add dozens of parameters to the import backup offering API which are only really going to be used for one provider. This way, the original design of the API is preserved. Furthermore, you may note that the APIs are intentionally not called
There are two main issues with using bitmaps:
At the end of the day, this PR adds a new backup provider option for users. They will be free to choose the provider that best fits their needs. This is one of the reasons it was done as a new backup provider; KNIB and other backup providers do not have to cancel each other out.
Hi @JoaoJandre
It has a big benefit for use cases where someone wants multiple tiers of backup storage with different cost and recovery characteristics. For example, long-term archival backups might go to cheap cold storage like tape, while backups that require faster recovery (better RTO) may use more expensive SSD-backed storage. You can have multiple backup offerings for different use cases and VMs can be attached to the required offerings.
Capacity is only visible to admins for backup repositories as well, so that is not the issue.
It does have its benefits, but keep in mind that the decompression cost is paid on reads. Also, if someone is using a restored VM, they might see the physical size grow disproportionately to the actual data written, due to decompression. It's still a useful feature, and it's good that it is optional.
If there is a possibility that these parameters can be used by other providers, they should be added to the existing API.
Hello, @abh1sar
You can have exactly that with this implementation. Essentially, backup repositories and secondary storages are the same thing with different names. Using selectors, you can have offerings tied to specific secondary storages, or offerings that span multiple storages; it was made to be flexible.
Again, you can have dedicated secondary storages using selectors.
While I was not the one who introduced the backup framework, looking at its design, it was clearly intended to keep as little configuration as possible on the ACS side, leaving those details to the backup provider. If we add these parameters to the import backup offering API, I'm sure many users will be confused when they do nothing for Veeam and Dell Networker. I did not want to warp the original design of configuring offerings on the provider side and only importing them into ACS; this is why I created the concept of native offerings. With KNIB (and NAS), the provider is ACS, so the configuration can still be made on the "backup provider", and the import API follows the same logic it always had.
A provider is native if the backup implementation is made by ACS. Thus, the current native providers would be NAS and KNIB. Veeam and Networker are external providers.
They are used to configure the details that would be configured in the backup provider if it was an external provider, such as Veeam for example.
I have yet to add it to the GUI, if you have any suggestions for the GUI design, you are free to give your opinion. The GUI for the native backup offerings will be added in the future.
Again, you absolutely can do this with KNIB. But you may also choose not to; it's flexible.
Description
This PR adds a new native incremental backup provider for KVM. The design document, which details the implementation, can be found at https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406622120.
The validation process which is detailed in the design document will be added to this PR soon.
The file extraction process will be added in a later PR.
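As a rough illustration of how an incremental chain might interact with the `backup.chain.size` setting (this is my reading of the design document, not code from this PR): once a chain reaches the configured length, the next backup starts a fresh chain with a full backup.

```python
def next_backup_type(chain_len: int, chain_size: int) -> str:
    # Hypothetical rotation rule: a chain begins with a full backup,
    # continues with incrementals, and is capped at chain_size entries,
    # after which a new full backup starts a new chain.
    if chain_len == 0 or chain_len >= chain_size:
        return "full"
    return "incremental"

# With backup.chain.size=3 (as used in the tests below), seven
# consecutive backups would alternate between chains like this:
types, chain_len = [], 0
for _ in range(7):
    t = next_backup_type(chain_len, 3)
    chain_len = 1 if t == "full" else chain_len + 1
    types.append(t)
# types == ['full', 'incremental', 'incremental',
#           'full', 'incremental', 'incremental', 'full']
```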
This PR adds a few new APIs:
The `createNativeBackupOffering` API has the following parameters:

- `name`
- `compress` (default: `false`)
- `validate` (default: `false`)
- `allowQuickRestore` (default: `false`)
- `allowExtractFile` (default: `false`)
- `backupchainsize`
- `compressionlibrary` (default: `zstd`)

The `deleteNativeBackupOffering` API has the following parameter:

- `id`

A native backup offering can only be removed if it is not currently imported.

The `listNativeBackupOfferings` API has the following parameters:

- `id`
- `compress`
- `validate`
- `allowQuickRestore`
- `allowExtractFile`
- `showRemoved` (default: `false`)

By default, it lists all offerings that have not been removed.

The `listBackupCompressionJobs` API has the following parameters:

- `id`
- `backupid`
- `hostid`
- `executing` (this parameter is implicit)
- `zoneid`
- `type` (`Starting` or `Finalizing`)
- `scheduled`
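For illustration, the new APIs are invoked like any other CloudStack API. The sketch below signs a `createNativeBackupOffering` request with the standard CloudStack HMAC-SHA1 signing scheme; the credentials and parameter values are placeholders, and the exact encoding details may differ from a given client library.

```python
import base64
import hashlib
import hmac
import urllib.parse

def sign_request(params: dict, secret_key: str) -> str:
    # CloudStack signing: sort parameters by name, URL-encode the values,
    # lowercase the resulting query string, HMAC-SHA1 it with the secret
    # key, and base64-encode the digest.
    query = "&".join(
        f"{k}={urllib.parse.quote(str(v), safe='*')}"
        for k, v in sorted(params.items(), key=lambda kv: kv[0].lower())
    )
    digest = hmac.new(secret_key.encode(), query.lower().encode(),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode()

# Placeholder request for the API added by this PR
params = {
    "command": "createNativeBackupOffering",
    "name": "knib-compressed",
    "compress": "true",
    "compressionlibrary": "zstd",
    "apiKey": "my-api-key",       # placeholder credential
    "response": "json",
}
signature = sign_request(params, "my-secret-key")
```

The signature would then be appended as the `signature` query parameter of the GET request.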
It also adds parameters to the following APIs:
- The `isolated` parameter was added to the `createBackup` and `createBackupSchedule` APIs.
- The `quickRestore` parameter was added to the `restoreBackup`, `restoreVolumeFromBackupAndAttachToVM` and `createVMFromBackup` APIs.
- The `hostId` parameter was added to the `restoreBackup` and `restoreVolumeFromBackupAndAttachToVM` APIs; it can only be used by root admins, and only when quick restore is true.

New settings were also added:
- `backup.chain.size`
- `knib.timeout`
- `backup.compression.task.enabled` (default: `true`)
- `backup.compression.max.concurrent.compressions.per.host` (default: `5`)
- `backup.compression.max.job.retries` (default: `2`)
- `backup.compression.retry.interval` (default: `60`)
- `backup.compression.timeout` (default: `28800`)
- `backup.compression.minimum.free.storage` (default: `1`)
- `backup.compression.coroutines` (default: `1`)
- `backup.compression.rate.limit` (default: `0`)

Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
Tests related to disk-only VM snapshots
Basic tests with backup
Using `backup.chain.size=3`

Interactions with other functionalities
I created a new VM with a root disk and a data disk for the tests below.
Configuration Tests
Compression Tests
Tests performed with an offering that supports compressed backups
Tests with `restoreVolumeFromBackupAndAttachToVM`

Tests with `restoreBackup`